Value Function Discovery in Markov Decision Processes With Evolutionary Algorithms



Similar resources

Value-Function Approximations for Partially Observable Markov Decision Processes

Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a set of imperfect or noisy observations. The modeling advantage of POMDPs, however, comes at a price — exact methods for solving them are computationally very...


The value functions of Markov decision processes

We provide a full characterization of the set of value functions of Markov decision processes.


Simulation-Based Algorithms for Markov Decision Processes

Title of Dissertation: Simulation-Based Algorithms for Markov Decision Processes
Ying He, Doctor of Philosophy, 2002
Dissertation directed by: Professor Steven I. Marcus, Department of Electrical & Computer Engineering; Professor Michael C. Fu, Department of Decision & Information Technologies

Problems of sequential decision making under uncertainty are common in manufacturing, computer and commun...


Markov Decision Processes: Concepts and Algorithms

Situated in between supervised learning and unsupervised learning, the paradigm of reinforcement learning deals with learning in sequential decision making problems in which there is limited feedback. This text introduces the intuitions and concepts behind Markov decision processes and two classes of algorithms for computing optimal behaviors: reinforcement learning and dynamic programming. Fir...


Adaptive Algorithms for Markov Decision Processes

[Abstract text is mis-encoded in the source and cannot be recovered. The legible fragments mention Bellman [3, 4], Howard [9], the curse of dimensionality, and reinforcement learning.]



Journal

Journal title: IEEE Transactions on Systems, Man, and Cybernetics: Systems

Year: 2016

ISSN: 2168-2216,2168-2232

DOI: 10.1109/tsmc.2015.2475716